90 research outputs found
Kinship Verification from Videos using Spatio-Temporal Texture Features and Deep Learning
Automatic kinship verification using facial images is a relatively new and
challenging research problem in computer vision. It consists in automatically
predicting whether two persons have a biological kin relation by examining
their facial attributes. While most of the existing works extract shallow
handcrafted features from still face images, we approach this problem from
a spatio-temporal point of view and explore the use of both shallow texture
features and deep features for characterizing faces. Promising results,
especially those of deep features, are obtained on the benchmark UvA-NEMO Smile
database. Our extensive experiments also show the superiority of using videos
over still images, hence pointing out the important role of facial dynamics in
kinship verification. Furthermore, the fusion of the two types of features
(i.e. shallow spatio-temporal texture features and deep features) shows
significant performance improvements compared to state-of-the-art methods.
Comment: 7 pages
A Cyberpunk 2077 perspective on the prediction and understanding of future technology
Science fiction and video games have long served as valuable tools for
envisioning and inspiring future technological advancements. This position
paper investigates the potential of Cyberpunk 2077, a popular science fiction
video game, to shed light on the future of technology, particularly in the
areas of artificial intelligence, edge computing, augmented humans, and
biotechnology. By analyzing the game's portrayal of these technologies and
their implications, we aim to understand the possibilities and challenges that
lie ahead. We discuss key themes such as neurolink and brain-computer
interfaces, multimodal recording systems, virtual and simulated reality,
digital representation of the physical world, augmented and AI-based home
appliances, smart clothing, and autonomous vehicles. The paper highlights the
importance of designing technologies that can coexist with existing preferences
and systems, considering the uneven adoption of new technologies. Through this
exploration, we emphasize the potential of science fiction and video games like
Cyberpunk 2077 as tools for guiding future technological advancements and
shaping public perception of emerging innovations.
Comment: 12 pages, 7 figures
Evaluation of real-time LBP computing in multiple architectures (Journal of Real-Time Image Processing)
Local Binary Pattern (LBP) is a texture operator used in many computer vision applications that often require real-time operation on multiple computing platforms. The emergence of new video standards has increased typical resolutions and frame rates, which demand considerable computational performance. Since LBP is essentially a pixel operator whose cost scales with image size, straightforward implementations are usually insufficient to meet these requirements. To identify the solutions that maximize the performance of real-time LBP extraction, we compare a series of different implementations in terms of computational performance and energy efficiency, and analyze the optimizations that can be made to reach real-time performance on multiple platforms with their different available computing resources. Our contribution includes an extensive survey of LBP implementations on different platforms found in the literature. To provide a more complete evaluation, we have implemented the LBP algorithms on several platforms, such as Graphics Processing Units, mobile processors and a hybrid programming model image coprocessor. We have also extended the evaluation of some of the solutions found in previous work. In addition, we publish the source code of our implementations.
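To make the "pixel operator" description concrete, here is a minimal sketch of the basic 3x3 LBP in plain Python. It is not one of the optimized implementations the abstract refers to; the function names and the clockwise bit ordering are assumptions chosen for illustration.

```python
def lbp_3x3(image):
    """Compute basic 8-neighbour LBP codes for the interior pixels of an image.

    `image` is a list of equal-length rows of integer gray levels. The result
    is two pixels smaller in each dimension because border pixels are skipped.
    """
    # Neighbour offsets (dy, dx), clockwise from the top-left pixel.
    # The bit ordering is a convention; this particular one is an assumption.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    height, width = len(image), len(image[0])
    codes = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            center = image[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the center.
                if image[y + dy][x + dx] >= center:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


def lbp_histogram(codes):
    """Count how often each of the 256 possible LBP codes occurs."""
    hist = [0] * 256
    for row in codes:
        for code in row:
            hist[code] += 1
    return hist


if __name__ == "__main__":
    patch = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    print(lbp_3x3(patch))  # [[120]]
```

Every output pixel needs eight comparisons, so the work grows linearly with the number of pixels, which is why higher resolutions and frame rates put pressure on straightforward implementations.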
Design and testing of customized splints using 3D printing.
This project aims to take the splint model currently used for fractures a step further. We try to replace the material used today, plaster of Paris, which is chosen mainly because it is inexpensive and adapts well to the shape of the fractured limb to which it is applied. Given these qualities, we believe that 3D-printed materials can match these characteristics quite well: parts manufactured by this process have a very low cost, and the main material used is PLA plastic, with which a stiffness similar to that of hardened plaster could be achieved. On these premises, a series of tests and iterations is carried out to find a 3D-printed PLA splint that fits the target limb as closely as possible and is strong enough to immobilize the bone. Using this material for orthoses also offers further advantages, such as lower weight and greater ease of movement for the user, since the plastic can get wet and allows greater mobility, along with other characteristics discussed throughout the project.
Universidad de Sevilla. Máster Universitario en Ingeniería Industrial
MAMAF-Net: Motion-Aware and Multi-Attention Fusion Network for Stroke Diagnosis
Stroke is a major cause of mortality and disability worldwide, and one in four
people is at risk of suffering one in their lifetime. The pre-hospital
stroke assessment plays a vital role in identifying stroke patients accurately
to accelerate further examination and treatment in hospitals. Accordingly, the
National Institutes of Health Stroke Scale (NIHSS), Cincinnati Pre-hospital
Stroke Scale (CPSS) and Face Arm Speech Time (F.A.S.T.) are globally known tests
for stroke assessment. However, the validity of these tests is questionable in the
absence of neurologists. Therefore, in this study, we propose a motion-aware
and multi-attention fusion network (MAMAF-Net) that can detect stroke from
multimodal examination videos. Contrary to other studies on stroke detection
from video analysis, our study for the first time proposes an end-to-end
solution from multiple video recordings of each subject with a dataset
encapsulating stroke, transient ischemic attack (TIA), and healthy controls.
The proposed MAMAF-Net consists of motion-aware modules to sense the mobility
of patients, attention modules to fuse the multi-input video data, and 3D
convolutional layers to perform diagnosis from the attention-based extracted
features. Experimental results over the collected StrokeDATA dataset show that
the proposed MAMAF-Net achieves a successful detection of stroke with 93.62%
sensitivity and 95.33% AUC score.
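As a rough illustration of what fusing multi-input video data with attention can mean, the sketch below weights one feature vector per video with softmax attention scores and sums them. It is not the MAMAF-Net architecture: the function names, the dot-product scoring and the `query` vector (which would normally be learned) are all assumptions.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, query):
    """Fuse one feature vector per video into a single vector.

    `features` is a list of equal-length vectors (one per video recording);
    `query` is a hypothetical learned vector of the same length.
    """
    # Relevance score of each video: dot product with the query vector.
    scores = [sum(f * q for f, q in zip(vec, query)) for vec in features]
    weights = softmax(scores)
    dim = len(features[0])
    # Attention-weighted sum over the videos, one dimension at a time.
    fused = [sum(w * vec[d] for w, vec in zip(weights, features))
             for d in range(dim)]
    return fused, weights


if __name__ == "__main__":
    per_video = [[1.0, 0.0], [0.0, 1.0]]
    fused, weights = attention_fuse(per_video, query=[0.0, 0.0])
    print(fused, weights)  # [0.5, 0.5] [0.5, 0.5]
```

With a zero query every video receives the same weight, so the fusion reduces to a plain average; a trained query would shift weight toward the more informative recordings.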
Improving Depression estimation from facial videos with face alignment, training optimization and scheduling
Deep learning models have shown promising results in recognizing depressive
states using video-based facial expressions. While successful models typically
rely on 3D-CNNs or video distillation techniques, the varying use of
pretraining, data augmentation, preprocessing, and optimization techniques
across experiments makes it difficult to make fair architectural comparisons.
We propose instead to enhance two simple models based on ResNet-50 that use
only static spatial information by using two specific face alignment methods
and improved data augmentation, optimization, and scheduling techniques. Our
extensive experiments on benchmark datasets obtain similar results to
sophisticated spatio-temporal models for single streams, while the score-level
fusion of two different streams outperforms state-of-the-art methods. Our
findings suggest that specific modifications in the preprocessing and training
process result in noticeable differences in the performance of the models and
could mask the differences originally attributed to the use of different neural
network architectures.
Comment: 5 pages
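The score-level fusion mentioned in this abstract can be pictured as combining the per-video predictions of two independently trained streams. The sketch below uses a weighted average; the abstract does not state the fusion rule, so the averaging, the function name and the default weight are assumptions.

```python
def fuse_scores(scores_a, scores_b, weight_a=0.5):
    """Combine per-sample predictions of two streams by weighted averaging.

    `scores_a` and `scores_b` hold one predicted depression score per video;
    `weight_a` is the (hypothetical) weight given to the first stream.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both streams must score the same samples")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    weight_b = 1.0 - weight_a
    return [weight_a * a + weight_b * b for a, b in zip(scores_a, scores_b)]


if __name__ == "__main__":
    stream_a = [10.0, 20.0]  # illustrative values, not results from the paper
    stream_b = [14.0, 16.0]
    print(fuse_scores(stream_a, stream_b))  # [12.0, 18.0]
```

Because the fusion acts only on the outputs, each stream can keep its own face alignment and training recipe.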
Audio-Based Classification of Respiratory Diseases using Advanced Signal Processing and Machine Learning for Assistive Diagnosis Support
In global healthcare, respiratory diseases are a leading cause of mortality,
underscoring the need for rapid and accurate diagnostics. To advance rapid
screening techniques via auscultation, our research focuses on employing one of
the largest publicly available medical databases of respiratory sounds to train
multiple machine learning models able to classify different health conditions.
Our method combines Empirical Mode Decomposition (EMD) and spectral analysis to
extract physiologically relevant biosignals from acoustic data, closely tied to
cardiovascular and respiratory patterns, which sets our approach apart from
conventional audio feature extraction practices. We use Power
Spectral Density analysis and filtering techniques to select Intrinsic Mode
Functions (IMFs) strongly correlated with underlying physiological phenomena.
These biosignals undergo a comprehensive feature extraction process for
predictive modeling. Initially, we deploy a binary classification model that
demonstrates a balanced accuracy of 87% in distinguishing between healthy and
diseased individuals. Subsequently, we employ a six-class classification model
that achieves a balanced accuracy of 72% in diagnosing specific respiratory
conditions like pneumonia and chronic obstructive pulmonary disease (COPD). For
the first time, we also introduce regression models that estimate age and body
mass index (BMI) based solely on acoustic data, as well as a model for gender
classification. Our findings underscore the potential of this approach to
significantly enhance assistive and remote diagnostic capabilities.
Comment: 5 pages, 2 figures, 3 tables, Conference paper
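One way to picture the PSD-based selection step is to keep only those components whose spectral power is concentrated in a frequency band of interest. The sketch below assumes the IMFs have already been produced by some EMD implementation and uses a naive periodogram from the standard library; the band limits, the 0.5 threshold and all function names are hypothetical, not values from the paper.

```python
import cmath
import math


def periodogram(signal, sample_rate):
    """Naive DFT-based power spectrum estimate (up to a constant scale)."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        freqs.append(k * sample_rate / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def band_power_fraction(signal, sample_rate, low_hz, high_hz):
    """Fraction of total spectral power that falls inside [low_hz, high_hz]."""
    freqs, power = periodogram(signal, sample_rate)
    total = sum(power)
    if total == 0:
        return 0.0
    in_band = sum(p for f, p in zip(freqs, power) if low_hz <= f <= high_hz)
    return in_band / total


def select_components(components, sample_rate, low_hz, high_hz, min_fraction=0.5):
    """Return indices of components dominated by the chosen frequency band."""
    return [i for i, comp in enumerate(components)
            if band_power_fraction(comp, sample_rate, low_hz, high_hz) >= min_fraction]


if __name__ == "__main__":
    rate, n = 64, 64
    slow = [math.sin(2 * math.pi * 1 * t / rate) for t in range(n)]   # 1 Hz
    fast = [math.sin(2 * math.pi * 10 * t / rate) for t in range(n)]  # 10 Hz
    print(select_components([slow, fast], rate, low_hz=0.5, high_hz=2.0))  # [0]
```

For real recordings a Welch estimate from a signal-processing library would be a more robust choice than this naive periodogram, which is used here only to keep the example dependency-free.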
Natural course of septo-optic dysplasia: Retrospective analysis of 20 cases
Introduction. Septo-optic dysplasia (SOD) is the variable combination of signs of dysgenesis of the midline of the brain, hypoplasia of the optic nerves and hypothalamic-pituitary dysfunction, sometimes associated with a varied spectrum of malformations of the cerebral cortex. Aims. To describe the natural history and neuroimaging findings in a series of 20 diagnosed patients. Patients and methods. We retrospectively review the epidemiological, clinical and neuroradiological characteristics of 20 consecutive patients diagnosed with SOD between January 1985 and January 2010. Data from computerised tomography, magnetic resonance imaging of the head, electroencephalogram, visual evoked potentials, ophthalmological evaluation, karyotyping and endocrinological studies were analysed. In seven patients, the homeobox gene HESX1 was studied. Results. Sixty per cent of the cases had pathological antecedents in the first trimester of gestation, with normal foetal ultrasound scans. Clinically, the most striking features were visual manifestations (85%), endocrine disorders (50%), mental retardation (60%) and epileptic seizures (55%). Fifty-five per cent were associated with abnormal neuronal migration. In 45%, SOD was the only neuroimaging finding. Karyotyping was performed in all cases, with normal results. The HESX1 gene study was positive in two of the seven cases studied (both with isolated SOD). None of those with a mutation in HESX1 presented familial consanguinity. No genetic study was conducted in the parents. Conclusions. SOD must be classified as a heterogeneous malformation syndrome associated with multiple brain, ocular, endocrine and systemic anomalies. The most severe forms are associated with abnormal neuronal migration and cortical organisation.
Introducing VTT-ConIoT: A Realistic Dataset for Activity Recognition of Construction Workers Using IMU Devices
Sustainable work aims at improving working conditions to allow workers to effectively extend their working life. In this context, occupational safety and well-being are major concerns, especially in labor-intensive fields such as construction-related work. Internet of Things and wearable sensors provide unobtrusive technology that could enhance safety using human activity recognition (HAR) techniques and have the potential to improve working conditions and health. However, the research community lacks commonly used standard datasets that provide realistic and varied activities from multiple users. In this article, our contributions are threefold. First, we present VTT-ConIoT, a new publicly available dataset for the evaluation of HAR from inertial sensors in professional construction settings. The dataset, which contains data from 13 users and 16 different activities, is collected from three different wearable sensor locations. Second, we provide a benchmark baseline for human activity recognition that shows a classification accuracy of up to 89% for a six-class setup and up to 78% for a more granular sixteen-class one. Finally, we analyze the representativeness and usefulness of the dataset by comparing it with data collected in a pilot study in a real construction environment with real workers.